From Probability to Likelihood: The Science of Inference
MATH003 Lesson 6
Statistical inference marks the transition from predicting outcomes based on known parameters (Probability) to determining which parameters are most consistent with observed data (Likelihood). While a probability density function $f(x|\theta)$ describes the distribution of data $x$ for a fixed $\theta$, the Likelihood function $L(\theta|x)$ treats the observed data as fixed and varies the parameter $\theta$ to quantify the relative support for different hypotheses.

The Inversion Principle

The likelihood function is the joint density of the data, viewed as a function of $\theta$ with the observed data held fixed. For a Normal model with known variance $\sigma_0^2$, the likelihood is, up to a multiplicative constant not depending on $\theta$:

$L ( \theta | x_1, \dots, x_n ) = \exp\left( -\frac{n}{2\sigma_0^2} (\bar{x} - \theta)^2 \right)$

Here, we evaluate the "plausibility" of different $\theta$ values given the sample mean $\bar{x}$. To find the peak of this plausibility, we use Definition 6.2.2: the log-likelihood $l(\theta | s) = \ln L(\theta | s)$. Because $\ln$ is strictly increasing, maximizing $l$ is equivalent to maximizing $L$, and the transformation turns products of independent observations into sums, making the maximization of complex models computationally feasible.
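As a sketch of this idea, assuming the Normal model above with known $\sigma_0$ (the sample, seed, and grid below are hypothetical choices for illustration), the following snippet evaluates the log-likelihood on a grid of $\theta$ values and checks that its maximizer lands at the sample mean:

```python
import numpy as np

def log_likelihood(theta, x, sigma0):
    """Log-likelihood of a Normal(theta, sigma0^2) model,
    up to an additive constant not depending on theta."""
    n = len(x)
    xbar = np.mean(x)
    return -n / (2 * sigma0**2) * (xbar - theta) ** 2

# Hypothetical data: 100 draws from a Normal with known sigma0 = 2.
rng = np.random.default_rng(0)
x = rng.normal(loc=5.0, scale=2.0, size=100)

# Evaluate l(theta) on a grid and locate its peak.
thetas = np.linspace(3.0, 7.0, 401)        # grid spacing 0.01
ll = np.array([log_likelihood(t, x, sigma0=2.0) for t in thetas])
mle = thetas[int(np.argmax(ll))]

print(mle, np.mean(x))  # grid maximizer should sit next to the sample mean
```

Since $l(\theta)$ here is a downward parabola in $\theta$, its exact maximizer is $\bar{x}$; the grid search recovers it to within the grid spacing.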

Worked Example: The Height Survey (EXAMPLE 6.3.5)

The Data

Consider a sample of $n=30$ heights with a calculated standard deviation of $s=2.379$. Using the Location-Scale Normal Model, we seek to infer the true mean $\theta$.

Inference & Precision

The standard error of $\bar{x}$ is $s/\sqrt{n} = 2.379/\sqrt{30} \approx 0.43434$. This value measures the "sharpness" of our likelihood peak: a smaller standard error implies a narrower, sharper peak, representing higher precision in our inference about $\theta$.
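The arithmetic can be checked directly, using the sample summaries stated in the example:

```python
import math

s, n = 2.379, 30            # sample standard deviation and sample size
se = s / math.sqrt(n)       # standard error of the sample mean
print(round(se, 5))         # 0.43434
```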

Dimensionality and Constraints

In complex scenarios like EXAMPLE 6.1.5 (Multinomial Models), we must account for logical dependencies. As noted, "Notice that it is really only two-dimensional, because as soon as we know the value of any two of the $\theta_i$'s... we immediately know the value of the remaining parameter." This constraint is vital for correctly defining the parameter space $\Omega$.
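To illustrate the constraint (the three-category counts below are hypothetical), the multinomial log-likelihood can be written as a function of only two free parameters, with the third fixed by $\theta_1 + \theta_2 + \theta_3 = 1$; the maximizer is the vector of sample proportions:

```python
import math

# Hypothetical counts for a 3-category multinomial, n = 10.
counts = [6, 3, 1]
n = sum(counts)

def log_likelihood(theta1, theta2):
    """Multinomial log-likelihood over the 2-D parameter space;
    theta3 is determined by the constraint theta1 + theta2 + theta3 = 1."""
    theta3 = 1.0 - theta1 - theta2
    if theta1 <= 0 or theta2 <= 0 or theta3 <= 0:
        return float("-inf")   # outside the parameter space Omega
    return (counts[0] * math.log(theta1)
            + counts[1] * math.log(theta2)
            + counts[2] * math.log(theta3))

# The MLE is the vector of sample proportions.
print([c / n for c in counts])  # [0.6, 0.3, 0.1]
```

Even though three probabilities appear in the model, the function above takes only two arguments, which is exactly the two-dimensionality the quotation describes.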

Asymptotic Foundations

The bridge from likelihood to inference relies on the Central Limit Theorem: as $n \to \infty$, suitably standardized estimators converge in distribution to a normal. Specifically, in the EXAMPLE 6.5.4 Bernoulli Model:

$Z = \frac{\sqrt{n}(\bar{X} - \theta)}{\sqrt{\bar{X}(1 - \bar{X})}} \xrightarrow{D} N(0, 1)$

This allows us to quantify uncertainty using z-intervals and p-values, provided we have sufficiently large samples.

🎯 Core Principle
Distribution-free methods of statistical inference require only minimal assumptions about the sampling distribution, making them robust when the family $\{P_{\theta} : \theta \in \Omega\}$ is very large. In contrast, parametric likelihood methods rely on the curvature of the log-likelihood, where the Fisher Information $nI(\theta)$ determines the variance of our score function.